Multi-View Clustering and Feature Learning via Structured Sparsity

نویسندگان

  • Hua Wang
  • Feiping Nie
  • Heng Huang
چکیده

Combining information from various data sources has become an important research topic in machine learning with many scientific applications. Most previous studies employ kernels or graphs to integrate different types of features, which routinely assume one weight for one type of features. However, for many problems, the importance of features in one source to an individual cluster of data can be varied, which makes the previous approaches ineffective. In this paper, we propose a novel multi-view learning model to integrate all features and learn the weight for every feature with respect to each cluster individually via new joint structured sparsity-inducing norms. The proposed multi-view learning framework allows us not only to perform clustering tasks, but also to deal with classification tasks by an extension when the labeling knowledge is available. A new efficient algorithm is derived to solve the formulated objective with rigorous theoretical proof on its convergence. We applied our new data fusion method to five broadly used multi-view data sets for both clustering and classification. In all experimental results, our method clearly outperforms other related state-of-the-art methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Convex Subspace Representation Learning from Multi-View Data

Learning from multi-view data is important in many applications. In this paper, we propose a novel convex subspace representation learning method for unsupervised multi-view clustering. We first formulate the subspace learning with multiple views as a joint optimization problem with a common subspace representation matrix and a group sparsity inducing norm. By exploiting the properties of dual ...

متن کامل

Multi-label Learning via Structured Decomposition and Group Sparsity

Abstract. In multi-label learning, each sample is associated with several labels. Existing works indicate that exploring correlations between labels improve the prediction performance. However, embedding the label correlations into the training process significantly increases the problem size. Moreover, the mapping of the label structure in the feature space is not clear. In this paper, we prop...

متن کامل

Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering

Calculation of near-neighbor interactions among high dimensional, irregularly distributed data points is a fundamental task to many graph-based or kernel-based machine learning algorithms and applications. Such calculations, involving large, sparse interaction matrices, expose the limitation of conventional data-and-computation reordering techniques for improving space and time locality onmoder...

متن کامل

Posterior Regularization for Structured Latent Varaible Models

We present posterior regularization, a probabilistic framework for structured, weakly supervised learning. Our framework efficiently incorporates indirect supervision via constraints on posterior distributions of probabilistic models with latent variables. Posterior regularization separates model complexity from the complexity of structural constraints it is desired to satisfy. By directly impo...

متن کامل

Unsupervised Multi-View Feature Selection via Co-Regularization

Existing unsupervised feature selection algorithms are designed to extract the most relevant subset of features that can facilitate clustering and interpretation of the obtained results. However, these techniques are not applicable in many real-world scenarios where one has an access to datasets consisting of multiple views/representations e.g. various omics profiles of the patients, medical te...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013